Handy statistical lexicon
These are all important methods and concepts related to statistics that are not as well known as they should be. I hope that by giving them names, we can make the ideas more accessible:
Mister P: Multilevel regression and poststratification.
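A minimal sketch of the idea in Python, with invented numbers throughout (the cells, variances, and census counts are all hypothetical): partially pool noisy cell-level estimates toward the overall mean, then reweight them by population counts rather than by sample counts.

```python
import numpy as np

# Hypothetical cell-level survey data: one row per demographic cell
# (say, age group x education), with invented numbers throughout.
cell_means = np.array([0.62, 0.55, 0.48, 0.40])      # sample mean per cell
cell_n     = np.array([200,   50,   20,    5])       # respondents per cell
census_N   = np.array([1.0e6, 2.0e6, 4.0e6, 3.0e6])  # population per cell

# Stand-in for the "multilevel regression" step: partial pooling of each
# cell mean toward the overall mean, shrinking more when cell_n is small.
overall = np.average(cell_means, weights=cell_n)     # raw survey estimate
tau2, sigma2 = 0.01, 0.25   # assumed between-cell and within-cell variances
w = tau2 / (tau2 + sigma2 / cell_n)
theta = w * cell_means + (1 - w) * overall           # smoothed cell estimates

# The "poststratification" step: reweight by census counts, not by how
# many respondents happened to land in each cell.
mrp = np.sum(census_N * theta) / np.sum(census_N)
print(f"raw survey estimate: {overall:.3f}   MRP estimate: {mrp:.3f}")
```

The point of the reweighting is that each cell counts in proportion to how common it is in the population, not in the sample.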
The Secret Weapon: Fitting a statistical model repeatedly on several different datasets and then displaying all these estimates together.
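A minimal sketch in Python on simulated data (all numbers and variable names invented): fit the same regression separately to each year’s dataset, then display all the slope estimates, with standard errors, on a single plot.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
years = np.arange(2000, 2011)
est, se = [], []

# Fit the same simple regression y = a + b*x + error to each year's
# (simulated) dataset, saving the slope estimate and its standard error.
for year in years:
    n = 100
    x = rng.normal(size=n)
    b_true = 0.5 + 0.03 * (year - 2000)     # slowly drifting true slope
    y = 1.0 + b_true * x + rng.normal(scale=1.0, size=n)
    X = np.column_stack([np.ones(n), x])
    beta, res, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = res[0] / (n - 2)               # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)   # usual OLS covariance
    est.append(beta[1])
    se.append(np.sqrt(cov[1, 1]))

# The "secret weapon" display: all the estimates together on one plot.
plt.errorbar(years, est, yerr=se, fmt="o", capsize=3)
plt.xlabel("year"); plt.ylabel("estimated slope")
plt.title("Same model fit to each year's data")
plt.show()
```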
The Superplot: Line plot of estimates in an interaction, with circles showing group sizes and a line showing the regression of the aggregate averages.
The Folk Theorem: When you have computational problems, often there’s a problem with your model.
The Pinch-Hitter Syndrome: People whose job it is to do just one thing are not always so good at that one thing.
Weakly Informative Priors: What you should be doing when you think you want to use noninformative priors.
P-values and U-values: They’re different.
Conservatism: In statistics, the desire to use methods that have been used before.
The Backseat Driver Principle: Even if the advice or criticism is annoying, it makes sense to listen.
WWJD: What I think of when I’m stuck on an applied statistics problem.
Theoretical and Applied Statisticians, how to tell them apart: A theoretical statistician calls the data x; an applied statistician calls it y.
The Fallacy of the One-Sided Bet: Pascal’s wager, lottery tickets, and the rest.
Alabama First: Howard Wainer’s term for the common error of plotting in alphabetical order rather than based on some more informative variable.
The USA Today Fallacy: Counting all states (or countries) equally and forgetting that many more people live in the larger jurisdictions; give California the same space you give Montana and Delaware, and you’re ignoring millions and millions of Californians.
Second-Order Availability Bias: Generalizing from correlations you see in your personal experience to correlations in the population.
The “All Else Equal” Fallacy: Assuming that everything else is held constant, even when it’s not gonna be.
The Self-Cleaning Oven: A good package should contain the means of its own testing.
The Taxonomy of Confusion: What to do when you’re stuck.
The Blessing of Dimensionality: It’s good to have more data, even if you label this additional information as “dimensions” rather than “data points.”
Scaffolding: Understanding your model by comparing it to related models.
Ockhamite Tendencies: The irritating habit of trying to get other people to use oversimplified models.
Bayesian: A statistician who uses Bayesian inference for all problems even when it is inappropriate. I am a Bayesian statistician myself.
Multiple Comparisons: Generally not an issue if you’re doing things right but can be a big problem if you sloppily model hierarchical structures non-hierarchically.
Taking a Model Too Seriously: Really just another way of not taking it seriously at all.
God is in Every Leaf of Every Tree: No problem is too small or too trivial if we really do something about it.
As They Say in the Stagecoach Business: Remove the padding from the seats and you get a bumpy ride.
Story Time: When the numbers are put to bed, the stories come out.
The Foxhole Fallacy: There are no X’s in foxholes (where X = people who disagree with me on some issue of faith).
The Pinocchio Principle: A model that is created solely for computational reasons can take on a life of its own.
The Statistical Significance Filter: If an estimate is statistically significant, it’s probably an overestimate.
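A quick simulation in Python shows why (the true effect and standard error here are invented): when the truth is small relative to the noise, the estimates that clear the significance threshold are exactly the ones that got lucky, so their average sits far above the true value.

```python
import numpy as np

rng = np.random.default_rng(1)
true_effect, se = 0.1, 0.5        # assumed: small effect, noisy estimates
estimates = rng.normal(true_effect, se, size=100_000)

# Keep only the "statistically significant" estimates (|z| > 1.96).
significant = estimates[np.abs(estimates / se) > 1.96]

print(f"true effect:              {true_effect}")
print(f"mean of all estimates:    {estimates.mean():.3f}")
print(f"mean of significant ones: {significant.mean():.3f}")
# Conditioning on significance inflates the estimate several-fold here,
# and some of the significant estimates even have the wrong sign.
```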
Arrow’s Other Theorem (weak form): Any result can be published no more than five times.
Arrow’s Other Theorem (strong form): Any result will be published five times.
The Ramanujan Principle: Tables are read as crude graphs.
The Paradox of Philosophizing: If philosophy is outlawed, only outlaws will do philosophy.
Defaults: What statistics is the science of.
Default, the greatest trick it ever pulled: Convincing the world it didn’t exist.
The Methodological Attribution Problem: The many useful contributions of a good statistical consultant, or collaborator, will often be overly attributed to the statistician’s methods or philosophy.
The John Yoo Line: The point at which nothing you write gets taken seriously, and so you might as well become a hack because you have no scholarly reputation remaining.
The Chris Rock Effect: Some graphs visualize things we already knew, but so well that we get a shock of recognition: the joy of relearning what we already know, seeing it in a new way that makes us think more deeply about all sorts of related topics.
The Freshman Fallacy: Just because a freshman might raise a question, that does not make the issue irrelevant.
The Garden of Forking Paths: Multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time.
The One-Way Street Fallacy: Considering only one possibility of a change that can go in either direction.
The Pluralist’s Dilemma: How to recognize that my philosophy is just one among many, that my own embrace of this philosophy is contingent on many things beyond my control, while still expressing the reasons why I prefer my philosophy to the alternatives (at least for the problems I work on).
More Vampirical Than Empirical: Hypotheses that cannot be killed by mere evidence. (from Jeremy Freese)
Statistical Chemotherapy: It slightly poisons your key result but shifts an undesired result above the .05 threshold. (from Jeremy Freese)
Tell Me What You Don’t Know: That’s what I want to ask you.
Salad Tongs: Not to be used for painting.
The Edlin Factor: How much you should scale down published estimates.
Kangaroo: When it is vigorously jumping up and down, don’t use a bathroom scale to weigh a feather that is resting loosely in its pouch.
The Speed Racer Principle: Sometimes the most interesting aspect of a scientific or cultural product is not its overt content but rather its unexamined assumptions.
Uncertainty Interval: Say this instead of confidence or credible interval.
What would you do if you had all the data?: Rubin’s first question.
What were you doing before you had any data?: Rubin’s second question.
The Time-Reversal Heuristic: How to think about a published finding that is followed up by a careful preregistered replication.
The Status-Reversal Heuristic: When you’re evaluating a published claim by someone in a high-status profession or a high-status institution, imagine it was coming from someone of lower status.
Clarke’s Law: Any sufficiently crappy research is indistinguishable from fraud.
The wedding, never the marriage: With scientific journals, that’s what it’s all about.
The problem with peer review: The peers.
The “What does not kill my statistical significance makes it stronger” fallacy: The belief that statistical significance is particularly impressive when it was obtained under noisy conditions.
Reverse Poe: It’s evidently sincere, yet its contents are parodic.
The (Lance) Armstrong Principle: If you push people to promise more than they can deliver, they’re motivated to cheat.
The Chestertonian Principle: Extreme skepticism is a form of credulity.
The most important aspect of a statistical method: not what it does with the data but rather what data it uses.
The Pandora Principle: Once you’ve considered a possible interaction or bias or confounder, you can’t un-think it.
The Paradox of Influence: Anticipated influence becomes valueless if you end up saying whatever it takes to keep it.
Cantor’s Corner: Where you want to be.
Correlation: It does not even imply correlation.
The Javert Paradox: Suppose you find a problem with published work. If you just point it out once or twice, the authors of the work are likely to do nothing. But if you really pursue the problem, then you look like a Javert.
Eureka bias: When you think you made a discovery and then you don’t want to give it up, even if it turns out you interpreted your data wrong.
A picture plus 1000 words: Better than two pictures or 2000 words.
The Piranha Problem: These large effects can’t all coherently coexist.
The Australia principle: Build the parts of the model you need, as you need them.
Just because something is counterintuitive: Doesn’t mean it’s true.
Honesty and transparency: They’re not enough.
Breadcrumbs: I need that trail.
Random in: Random out.
16: How many times bigger your sample size needs to be if you want to estimate an interaction that is half the size of a main effect.
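The arithmetic behind the 16, as a quick sketch assuming a balanced design with equal residual standard deviation in each cell:

```latex
% Balanced design, residual sd sigma, total sample size n.
% Main effect: difference of two means (n/2 observations each):
\[
\mathrm{se}_{\mathrm{main}}
  = \sqrt{\frac{\sigma^2}{n/2} + \frac{\sigma^2}{n/2}}
  = \frac{2\sigma}{\sqrt{n}}
\]
% Interaction: difference of two differences (four cells of n/4 each):
\[
\mathrm{se}_{\mathrm{int}}
  = \sqrt{4 \cdot \frac{\sigma^2}{n/4}}
  = \frac{4\sigma}{\sqrt{n}}
  = 2\,\mathrm{se}_{\mathrm{main}}
\]
% Half the effect with twice the standard error cuts the z-score by a
% factor of 4; since se shrinks with sqrt(n), recovering it requires
\[
n \;\to\; 4^2\, n \;=\; 16\,n .
\]
```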
The Horse: Keep beating it; it’s never really dead.
The 80% Power Lie: None of this should be a surprise.
The Causal Identification Kool-Aid: The attitude by which any statistically significant difference is considered to represent some true population effect, as long as it is associated with a randomized treatment assignment, instrumental variable analysis, or regression discontinuity.
Strongest-link Fallacy: The idea that a chain of reasoning is as strong as its strongest link.
Evidence and Truth: They’re different.
Assumptions and Rigor: You can’t have one without the other.
Deference Trap: Without meta-science, what we get caught in.
Looking for a needle in a pile of needles: They’re not looking for a needle in a haystack, they’re . . .
Research Incumbency Rule: Once an article is published in some approved venue, it is taken as truth. Criticisms which would absolutely derail a submission in pre-publication review can be brushed aside if they are presented after publication.
Mathematical Simplicity: Not always the same as conceptual simplicity.
Cheeseburger snarfing diet gurus: When we teach, we don’t practice what we preach.
A spoonful of jargon with our factoids: Serves a function similar to the stuff in the toothpaste that gives you that tingly feeling when you brush: it has no direct function, but it conveys that it’s doing something.
A bunch of hoops to jump through: Research methods and statistics, as viewed by many researchers.
Hullman’s Theorem: Any experimental measure of graphical perception will inevitably not measure what it’s intended to measure.
McBroom’s Law: Sources must lose credibility when it is shown they promote falsehoods, even more so when they never take accountability for those falsehoods.
I know there are a bunch I’m forgetting; can you all refresh my memory, please? Thanks.
P.S. No, I don’t think I can ever match Stephen Senn in the definitions game.